- PCRE(3) UNIX System V PCRE(3)
-
-
-
- NAME
- pcreposix - POSIX API for Perl-compatible regular
- expressions.
-
- SYNOPSIS
- #include <pcreposix.h>
-
- int regcomp(regex_t *preg, const char *pattern,
- int cflags);
-
- int regexec(regex_t *preg, const char *string,
- size_t nmatch, regmatch_t pmatch[], int eflags);
-
- size_t regerror(int errcode, const regex_t *preg,
- char *errbuf, size_t errbuf_size);
-
- void regfree(regex_t *preg);
-
-
-
- DESCRIPTION
- This set of functions provides a POSIX-style API to the PCRE
- regular expression package. See pcre(3) for a description
- of the native API, which contains additional functionality.
- The functions described here are just wrapper functions that
- ultimately call the native API.
-
- As I am pretty ignorant about POSIX, these functions must be
- considered as experimental. I have implemented only those
- option bits that can be reasonably mapped to PCRE native
- options. Other POSIX options are not even defined. It may be
- that it is useful to define, but ignore, other options.
- Feedback from more knowledgeable folk may cause this kind of
- detail to change.
-
- When PCRE is called via these functions, it is only the API
- that is POSIX-like in style. The syntax and semantics of the
- regular expressions themselves are still those of Perl,
- subject to the setting of various PCRE options, as described
- below.
-
- The header for these functions is supplied as pcreposix.h to
- avoid any potential clash with other POSIX libraries. It
- can, of course, be renamed or aliased as regex.h, which is
- the "correct" name. It provides two structure types, regex_t
- for compiled internal forms, and regmatch_t for returning
- captured substrings. It also defines some constants whose
- names start with "REG_"; these are used for setting options
- and identifying error codes.
-
-
-
-
-
-
- Page 1 (printed 12/10/98)
- COMPILING A PATTERN
- The function regcomp() is called to compile a pattern into
- an internal form. The pattern is a C string terminated by a
- binary zero, and is passed in the argument pattern. The preg
- argument is a pointer to a regex_t structure which is used
- as a base for storing information about the compiled
- expression.
-
- The argument cflags is either zero, or contains one or more
- of the bits defined by the following macros:
-
- REG_ICASE
-
- The PCRE_CASELESS option is set when the expression is
- passed for compilation to the native function.
-
- REG_NEWLINE
-
- The PCRE_MULTILINE option is set when the expression is
- passed for compilation to the native function.
-
- The yield of regcomp() is zero on success, and non-zero
- otherwise. The preg structure is filled in on success, and
- one member of the structure is publicized: re_nsub contains
- the number of capturing subpatterns in the regular
- expression. Various error codes are defined in the header
- file.
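
As an illustration of the compile step, here is a minimal sketch that compiles a pattern and reads back re_nsub. It rests on assumptions not made by this page: it includes the system <regex.h> so that it builds without PCRE installed (with the wrapper you would include pcreposix.h instead), it defines REG_EXTENDED as 0 when the header does not provide it, and count_groups() is a hypothetical helper name.

```c
#include <regex.h>   /* assumption: system header; use <pcreposix.h> for the PCRE wrapper */

#ifndef REG_EXTENDED
#define REG_EXTENDED 0   /* assumption: harmless no-op where the header lacks this bit */
#endif

/* Hypothetical helper: compile `pattern` with the given cflags and return
   the number of capturing subpatterns, or -1 if regcomp() fails. */
long count_groups(const char *pattern, int cflags)
{
    regex_t re;
    long n;

    if (regcomp(&re, pattern, cflags | REG_EXTENDED) != 0)
        return -1;              /* non-zero yield: compilation failed */

    n = (long)re.re_nsub;       /* the one publicized member of regex_t */
    regfree(&re);               /* release memory tied to the structure */
    return n;
}
```

Calling count_groups("(a)(b(c))", REG_ICASE) should report three capturing subpatterns, while an unbalanced pattern such as "(abc" should give -1.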
-
-
-
- MATCHING A PATTERN
- The function regexec() is called to match a pre-compiled
- pattern preg against a given string, which is terminated by
- a zero byte, subject to the options in eflags. These can be:
-
- REG_NOTBOL
-
- The PCRE_NOTBOL option is set when calling the underlying
- PCRE matching function.
-
- REG_NOTEOL
-
- The PCRE_NOTEOL option is set when calling the underlying
- PCRE matching function.
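
The effect of the two eflags bits can be sketched as follows. This is an illustration under assumptions, not part of the package: it uses the system <regex.h> (pcreposix.h would replace it when linking against the wrapper), falls back to REG_EXTENDED = 0 if that macro is missing, and matches() is a hypothetical helper.

```c
#include <stddef.h>
#include <regex.h>   /* assumption: system header; use <pcreposix.h> for the PCRE wrapper */

#ifndef REG_EXTENDED
#define REG_EXTENDED 0   /* assumption: harmless no-op where the header lacks this bit */
#endif

/* Hypothetical helper: 1 if `pattern` matches `text`, 0 if it does not,
   -1 on a compile error or any other failure. */
int matches(const char *pattern, const char *text, int eflags)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* nmatch is 0, so no offsets are requested and pmatch may be NULL. */
    rc = regexec(&re, text, 0, NULL, eflags);
    regfree(&re);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

With this helper, "^abc" matches "abcdef" when eflags is zero but not when REG_NOTBOL is set, because the start of the string is then no longer treated as the beginning of a line; REG_NOTEOL has the corresponding effect on "def$".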
-
- The portion of the string that was matched, and also any
- captured substrings, are returned via the pmatch argument,
- which points to an array of nmatch structures of type
- regmatch_t, containing the members rm_so and rm_eo. These
- contain the offset to the first character of each substring
- and the offset to the first character after the end of each
- substring, respectively. The 0th element of the vector
- relates to the entire portion of string that was matched;
- subsequent elements relate to the capturing subpatterns of
- the regular expression. Unused entries in the array have
- both structure members set to -1.
-
- A successful match yields a zero return; various error codes
- are defined in the header file, of which REG_NOMATCH is the
- "expected" failure code.
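
The offsets returned in pmatch can be turned into substrings as in the sketch below. The assumptions are the same as for any build without PCRE: the system <regex.h> stands in for pcreposix.h, REG_EXTENDED is defined as 0 if absent, and extract_group() with its fixed array of 10 entries is a hypothetical helper chosen for illustration.

```c
#include <string.h>
#include <regex.h>   /* assumption: system header; use <pcreposix.h> for the PCRE wrapper */

#ifndef REG_EXTENDED
#define REG_EXTENDED 0   /* assumption: harmless no-op where the header lacks this bit */
#endif

#define MAX_GROUPS 10    /* illustrative limit: element 0 plus up to 9 subpatterns */

/* Hypothetical helper: copy capture `group` (0 = whole match) of the first
   match of `pattern` in `text` into `out`, truncating if needed.
   Returns 0 on success, -1 on no match, unused group, or error. */
int extract_group(const char *pattern, const char *text, size_t group,
                  char *out, size_t out_size)
{
    regex_t re;
    regmatch_t pm[MAX_GROUPS];
    size_t len;
    int rc = -1;

    if (out_size == 0 || group >= MAX_GROUPS)
        return -1;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* Unused entries have rm_so and rm_eo set to -1. */
    if (regexec(&re, text, MAX_GROUPS, pm, 0) == 0 && pm[group].rm_so != -1) {
        len = (size_t)(pm[group].rm_eo - pm[group].rm_so);
        if (len >= out_size)
            len = out_size - 1;
        memcpy(out, text + pm[group].rm_so, len);
        out[len] = '\0';
        rc = 0;
    }

    regfree(&re);
    return rc;
}
```

For the pattern "([a-z]+)=([0-9]+)" and the string "set width=80 now", group 0 is "width=80", group 1 is "width", and group 2 is "80"; asking for group 3 fails because that entry is unused.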
-
-
-
- ERROR MESSAGES
- The regerror() function maps a non-zero error code from
- either regcomp() or regexec() to a printable message. If preg is
- not NULL, the error should have arisen from the use of that
- structure. A message terminated by a binary zero is placed
- in errbuf. The length of the message, including the zero, is
- limited to errbuf_size. The yield of the function is the
- size of buffer needed to hold the whole message.
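
Because the yield is the size needed for the whole message, a caller can size the buffer with two calls. The following is a minimal sketch of that idea, assuming the system <regex.h> in place of pcreposix.h; error_text() is a hypothetical helper, and the one-byte probe buffer is simply a cautious way to obtain the size without relying on behaviour this page does not describe.

```c
#include <stdlib.h>
#include <regex.h>   /* assumption: system header; use <pcreposix.h> for the PCRE wrapper */

/* Hypothetical helper: return a malloc'ed copy of the message for `errcode`,
   or NULL if allocation fails. The caller must free() the result. */
char *error_text(int errcode, const regex_t *preg)
{
    char probe[1];
    size_t need;
    char *msg;

    /* First call: the message is truncated to fit, but the return value
       is the size required for the full message, including the zero. */
    need = regerror(errcode, preg, probe, sizeof probe);
    if (need == 0)
        need = 1;               /* defensive: always leave room for the zero */

    msg = malloc(need);
    if (msg == NULL)
        return NULL;

    /* Second call: the buffer is now large enough for the whole message. */
    regerror(errcode, preg, msg, need);
    return msg;
}
```

A typical use is to pass the non-zero value returned by regcomp() together with the same preg, print the resulting string, and then free it.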
-
-
-
- STORAGE
- Compiling a regular expression causes memory to be allocated
- and associated with the preg structure. The function
- regfree() frees all such memory, after which preg may no
- longer be used as a compiled expression.
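
One way to respect this lifetime rule is to compile once, match as often as needed, and free exactly once at the end. The sketch below shows that pattern under the usual assumptions: the system <regex.h> stands in for pcreposix.h, REG_EXTENDED falls back to 0 if undefined, and count_matching() is a hypothetical helper.

```c
#include <stddef.h>
#include <regex.h>   /* assumption: system header; use <pcreposix.h> for the PCRE wrapper */

#ifndef REG_EXTENDED
#define REG_EXTENDED 0   /* assumption: harmless no-op where the header lacks this bit */
#endif

/* Hypothetical helper: count how many of the `n` strings in `lines` match
   `pattern`, ignoring case. Returns -1 if the pattern does not compile. */
int count_matching(const char *pattern, const char *const *lines, size_t n)
{
    regex_t re;
    size_t i;
    int hits = 0;

    /* Memory is allocated here and stays associated with `re`. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE) != 0)
        return -1;              /* nothing to free after a failed compile */

    for (i = 0; i < n; i++) {
        if (regexec(&re, lines[i], 0, NULL, 0) == 0)
            hits++;
    }

    /* After this call `re` must not be passed to regexec() again. */
    regfree(&re);
    return hits;
}
```

Freeing inside the loop would force a recompile for every string, so keeping regcomp() and regfree() outside it is the cheaper arrangement.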
-
-
-
- AUTHOR
- Philip Hazel <ph10@cam.ac.uk>
- University Computing Service,
- New Museums Site,
- Cambridge CB2 3QG, England.
- Phone: +44 1223 334714
-
- Copyright (c) 1998 University of Cambridge.